Abstract

The idea of maximizing the likelihood of the observed range for a set 
of jointly realized counts has been employed in a variety of contexts. The 
applicability of the MLE introduced in T has been extended to the general 
case of a multivariate sample containing interval censored outcomes. In 
addition, a kernel density estimator and a related score function have 
been proposed leading to the construction of a modified Nadaraya-Watson
regression estimator. Finally, the author has treated the problems of
estimating the parameters of a multinomial distribution and the analysis
of contingency tables in the presence of censoring. 

1 Summary of previous work 

Let $X_1, X_2, \ldots, X_N$ be i.i.d. real valued random variables with distribution function $F$ and corresponding realized values $x_1, x_2, \ldots, x_N$. In the remainder of the paper we assume that $n \in \{1, 2, \ldots, N\}$. The realized value $x_n$ of the random variable $X_n$ is either an exact observation or censored into an interval $(t_{1n}, t_{2n}]$. We allow for the possibility that $t_{2n} = \infty$ and adopt the convention that $(t_{1n}, t_{2n}]$ is to be interpreted as $(t_{1n}, \infty)$ in that special case.

For a given element $\tau \in \mathrm{dom}(F)$ we define $d_\tau$ as the number of sample values observed to be less than or equal to $\tau$ and $a_\tau$ as the number of sample values observed to be greater than $\tau$. The count $u_\tau$ represents the number of censored sample values with censoring intervals that capture $\tau$. For example, a censored value $x_n$ is included in the count $d_\tau$ iff $t_{2n} \le \tau$ and in the count $a_\tau$ iff $\tau \le t_{1n}$. From these definitions it immediately follows that for any $\tau \in \mathrm{dom}(F)$ we have that

$$a_\tau + d_\tau + u_\tau = N.$$

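Let $k_\tau$ be the actual number of sample values not exceeding $\tau$. Due to the presence of the censoring mechanism the value of $k_\tau$ is only observed to satisfy $d_\tau \le k_\tau \le d_\tau + u_\tau$; we label the latter event as $E$. The likelihood of $E$ is given by

$$L(p; E) = \sum_{k=d_\tau}^{d_\tau + u_\tau} \binom{N}{k} p^k (1-p)^{N-k}$$

Let us define the function $\hat F : \mathrm{dom}(F) \to [0, 1]$ as the value of $p$ that maximizes $L(p; E)$ subject to the constraint $0 \le p \le 1$. The value of $\hat F(\tau)$ has been derived to be

$$
\hat F(\tau) =
\begin{cases}
0 & \text{if } d_\tau = 0 \text{ and } a_\tau \ge 1 \\
1 & \text{if } a_\tau = 0 \text{ and } d_\tau \ge 1 \\
1/2 & \text{if } u_\tau = N \\
\left( 1 + \sqrt[u_\tau + 1]{\dfrac{a_\tau (a_\tau + 1) \cdots (a_\tau + u_\tau)}{d_\tau (d_\tau + 1) \cdots (d_\tau + u_\tau)}} \right)^{-1} & \text{o.w.}
\end{cases}
$$

Furthermore, the function $\hat F$ can be used as an estimator for $F$ since it is a non-decreasing function over $\mathrm{dom}(F)$.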
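As an illustration, the closed form for the univariate estimator can be sketched in Python. The function name `censored_cdf_estimate` and its arguments `d`, `a`, `u` (standing for the three counts at a fixed point) are our own labels and not part of the original derivation; this is a minimal sketch under the assumption that the counts satisfy a + d + u = N.

```python
def censored_cdf_estimate(d: int, a: int, u: int) -> float:
    """Value of p maximizing sum_{k=d}^{d+u} C(N, k) p^k (1-p)^(N-k), N = d + a + u.

    d: values observed to be <= tau, a: values observed to be > tau,
    u: censored values whose interval captures tau (hypothetical argument names).
    """
    n = d + a + u
    if n == 0:
        raise ValueError("at least one observation is required")
    if u == n:   # likelihood is identically 1, so 1/2 is used by convention
        return 0.5
    if d == 0:   # here a >= 1: the likelihood is maximized at p = 0
        return 0.0
    if a == 0:   # here d >= 1: the likelihood is maximized at p = 1
        return 1.0
    ratio = 1.0
    for j in range(u + 1):
        ratio *= (a + j) / (d + j)
    return 1.0 / (1.0 + ratio ** (1.0 / (u + 1)))
```

With no censoring (u = 0) the expression reduces to d / N, the empirical distribution function.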



2 Multivariate extension 

In this section the definition of the estimator $\hat F$ has been extended to the general case of a sample of M-variate observations. Let $X_1, X_2, \ldots, X_N$ be i.i.d. M-vectors with distribution function $F$ and the matrix $D$ be defined as

$$
D =
\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}
=
\begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1M} \\
X_{21} & X_{22} & \cdots & X_{2M} \\
\vdots & \vdots & & \vdots \\
X_{N1} & X_{N2} & \cdots & X_{NM}
\end{pmatrix}
$$

For the rest of the paper we have assumed that all observations $X_{nm}$ are censored into corresponding intervals $(L_{nm}, R_{nm}]$ since the treatment of a dataset $D$ containing exact in addition to censored observations does not provide any new mathematical insight.

We also adopt the convention that unless explicitly stated otherwise, an index represented by a small letter ranges between 1 and the value of the corresponding capital letter inclusive. For example, $m \in \{1, 2, \ldots, M\}$. Furthermore, a random quantity will always be designated by a capital letter and the corresponding small letter will be reserved for its realization. For example, $x_n$ is the realization of the random vector $X_n$.

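Let $\mathcal{X}^{(m)}_{i_m}$ be the value of the $i_m$-th element in increasing order, $i_m \in \{1, 2, \ldots, I_m\}$, of the set

$$\{L_{1m}, L_{2m}, \ldots, L_{Nm}\} \cup \{R_{1m}, R_{2m}, \ldots, R_{Nm}\}$$

and the set $G^{(m)}$ be defined as

$$G^{(m)} = \{\mathcal{X}^{(m)}_1, \mathcal{X}^{(m)}_2, \ldots, \mathcal{X}^{(m)}_{I_m}\}$$

Consequently, the elements of $G^{(m)}$ are all distinct and such that

$$\mathcal{X}^{(m)}_1 < \mathcal{X}^{(m)}_2 < \cdots < \mathcal{X}^{(m)}_{I_m}$$

Let us also define the grid $G$ as $G = G^{(1)} \times G^{(2)} \times \cdots \times G^{(M)}$. Our goal will be to estimate $F$ over $G$.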

Let $x = (x_1, x_2, \ldots, x_M) \in R^M$ and $x' = (x'_1, x'_2, \ldots, x'_M) \in R^M$. We will write $x \le x'$ iff $x_m \le x'_m$ for every $m$. The expressions $x \ge x'$, $x < x'$ and $x > x'$ are defined analogously. Let $L_n = (L_{n1}, L_{n2}, \ldots, L_{nM})$ and $R_n = (R_{n1}, R_{n2}, \ldots, R_{nM})$. By analogy with the 1-dimensional case we define $d(x)$ as the count of observations $X_n$ such that $R_n \le x$ and $u(x)$ as the count of observations satisfying $L_n < x < R_n$. It is important to point out that the count $a(x) = N - d(x) - u(x)$ is not the number of observations observed to be greater than $x$. Finally, let $k(x)$ be the realized value of the actual count of observations such that $X_n \le x$ and $E$ designate the event $d(x) \le k(x) \le d(x) + u(x)$.
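The three counts can be sketched directly from the componentwise ordering defined above. The helper names `leq`, `lt` and `censoring_counts` are hypothetical, and the sketch assumes the censoring boxes are supplied as two parallel lists of lower and upper corner vectors.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]


def leq(x: Vector, y: Vector) -> bool:
    """Componentwise x <= y."""
    return all(xm <= ym for xm, ym in zip(x, y))


def lt(x: Vector, y: Vector) -> bool:
    """Componentwise x < y."""
    return all(xm < ym for xm, ym in zip(x, y))


def censoring_counts(lower: Sequence[Vector], upper: Sequence[Vector],
                     x: Vector) -> Tuple[int, int, int]:
    """Return (d(x), u(x), a(x)) for censoring boxes (lower[n], upper[n]]."""
    n_obs = len(lower)
    d = sum(1 for r in upper if leq(r, x))                 # R_n <= x
    u = sum(1 for l, r in zip(lower, upper)
            if lt(l, x) and lt(x, r))                      # L_n < x < R_n
    a = n_obs - d - u                                      # remainder
    return d, u, a
```

Note that `a` is obtained as a remainder rather than by a direct comparison, which mirrors the remark that a(x) does not count the observations lying above x.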

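Now we can estimate $F(x)$ by the value of the variable $p$ that maximizes the function

$$L(p; E) = \sum_{k=d(x)}^{d(x) + u(x)} \binom{N}{k} p^k (1-p)^{N-k}$$

subject to the constraint $0 \le p \le 1$. Consequently the estimator $\hat F$ of the unknown distribution function $F$ is given by

$$
\hat F(x) =
\begin{cases}
0 & \text{if } d(x) = 0 \text{ and } a(x) \ge 1 \\
1 & \text{if } a(x) = 0 \text{ and } d(x) \ge 1 \\
1/2 & \text{if } u(x) = N \\
\left( 1 + \sqrt[u(x) + 1]{\dfrac{a(x) (a(x) + 1) \cdots (a(x) + u(x))}{d(x) (d(x) + 1) \cdots (d(x) + u(x))}} \right)^{-1} & \text{o.w.}
\end{cases}
$$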
We briefly consider once again a sample of univariate observations $X_1, \ldots, X_N$ with $X_n$ censored into an interval $(L_n, R_n]$ and assume that the random vectors $(X_n, L_n, R_n)$ are all i.i.d. according to some cdf $F_{XLR}$. The latter function provides a quantitative description of the censoring mechanism at play. By treating each triple $(X_n, L_n, R_n)$ as a single vector observation and employing the estimation procedure just described we can construct an estimator $\hat F_{XLR}$ for the unknown function $F_{XLR}$, allowing us to estimate how the censoring mechanism operates.

3 Kernel density estimation in 1 and 2 dimensions

Consider a univariate random sample $Z_1, Z_2, \ldots, Z_N$ from some unknown pdf $f_Z$ and suppose that the corresponding observations are all exact. The kernel density estimate $\hat f_Z$ of $f_Z$ is defined as

$$\hat f_Z(z) = \frac{1}{N h} \sum_{n=1}^{N} K\left( \frac{z - z_n}{h} \right)$$

where $h$ is an appropriately chosen parameter. The rationale for such a construction is to place a "bump" of size $1/N$ centered over each one of the sample values $z_n$. The general shape of each bump is determined by the choice of the kernel function $K$ while its spread is controlled by the parameter $h$. All the bumps are set to be of equal size $1/N$ due to the i.i.d. nature of the observations. The size of the bump over $z_n$ can also be interpreted as the amount of probability assigned over the interval $(z_{n-1}, z_n]$ by the empirical cdf $\hat F_Z$ and is thus equal to $\hat F_Z(z_n) - \hat F_Z(z_{n-1})$.

We apply the reasoning from above to the case of a univariate random sample $X_1, X_2, \ldots, X_N$ from some unknown density function $f_X$ such that each $X_n$ is censored into some interval $(L_n, R_n]$. The set $G$ is reduced to the set $\{\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_I\}$ of unique element values of the set

$$\{L_1, L_2, \ldots, L_N\} \cup \{R_1, R_2, \ldots, R_N\}$$

listed in increasing order. We proceed to define the function $w : G \to [0, 1]$ by $w(\mathcal{X}_1) = \hat F(\mathcal{X}_1)$ and $w(\mathcal{X}_i) = \hat F(\mathcal{X}_i) - \hat F(\mathcal{X}_{i-1})$ if $2 \le i \le I$. Now we define the smoothed density estimator $\hat f_X$ as

$$\hat f_X(x) = \frac{1}{h} \sum_{i=1}^{I} w(\mathcal{X}_i) K\left( \frac{x - \mathcal{X}_i}{h} \right)$$
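A minimal sketch of the weighted construction follows. It assumes a Gaussian kernel (the text leaves K unspecified) and takes the values of the distribution function estimator on the ordered grid as given input; the names `grid_weights` and `smoothed_density` are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def gaussian_kernel(t: float) -> float:
    """Standard normal density, used here as an assumed choice of K."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def grid_weights(cdf_values: Sequence[float]) -> List[float]:
    """w(X_1) = F(X_1) and w(X_i) = F(X_i) - F(X_{i-1}) for i >= 2."""
    weights = [cdf_values[0]]
    for i in range(1, len(cdf_values)):
        weights.append(cdf_values[i] - cdf_values[i - 1])
    return weights


def smoothed_density(x: float, grid: Sequence[float], weights: Sequence[float],
                     h: float,
                     kernel: Callable[[float], float] = gaussian_kernel) -> float:
    """(1/h) * sum_i w(X_i) * K((x - X_i) / h)."""
    return sum(w * kernel((x - g) / h) for g, w in zip(grid, weights)) / h
```

Each grid point receives a bump whose size is the probability mass the estimator assigns to the interval ending at that point, in place of the constant 1/N used for exact data.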
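Next we generalize the latter construction to the case of a random sample $\{(X_n, Y_n)\}$ of censored 2-dimensional random vectors with unknown p.d.f. $f_{XY}$. The set $G$ is given by $G = G^{(x)} \times G^{(y)}$ where

$$G^{(x)} = \{\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_I\}$$

$$G^{(y)} = \{\mathcal{Y}_1, \mathcal{Y}_2, \ldots, \mathcal{Y}_J\}$$

The definition of the function $w : G \to [0, 1]$ is extended as follows: $w(\mathcal{X}_i, \mathcal{Y}_j) = 0$ if $i = 1$ or $j = 1$. In all other cases $w(\mathcal{X}_i, \mathcal{Y}_j)$ equals the cumulative probability assigned by $\hat F$ over the interior of the rectangle in $R^2$ defined by the points $(\mathcal{X}_{i-1}, \mathcal{Y}_{j-1})$, $(\mathcal{X}_i, \mathcal{Y}_{j-1})$, $(\mathcal{X}_i, \mathcal{Y}_j)$ and $(\mathcal{X}_{i-1}, \mathcal{Y}_j)$ along with the line segments connecting $(\mathcal{X}_i, \mathcal{Y}_{j-1})$ with $(\mathcal{X}_i, \mathcal{Y}_j)$ and $(\mathcal{X}_{i-1}, \mathcal{Y}_j)$ with $(\mathcal{X}_i, \mathcal{Y}_j)$. Consequently, the function value $w(\mathcal{X}_i, \mathcal{Y}_j)$, $2 \le i \le I$, $2 \le j \le J$, is given by

$$w(\mathcal{X}_i, \mathcal{Y}_j) = \hat F(\mathcal{X}_i, \mathcal{Y}_j) - \hat F(\mathcal{X}_i, \mathcal{Y}_{j-1}) - \hat F(\mathcal{X}_{i-1}, \mathcal{Y}_j) + \hat F(\mathcal{X}_{i-1}, \mathcal{Y}_{j-1})$$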

Pseudocode employing the relationship from above to compute the weights $w(\mathcal{X}_i, \mathcal{Y}_j)$ is provided next:

    FOR j = 1 : J
        w(X_1, Y_j) = 0
    NEXT j

    FOR i = 1 : I
        w(X_i, Y_1) = 0
    NEXT i

    FOR j = 2 : J
        FOR i = 2 : I
            w(X_i, Y_j) = F(X_i, Y_j) - F(X_i, Y_{j-1}) - F(X_{i-1}, Y_j) + F(X_{i-1}, Y_{j-1})
        NEXT i
    NEXT j
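The same loops can be written as a short runnable Python function. This is only a sketch: the function name `rectangle_weights` is ours, indices are shifted to start at 0, and the values of the distribution function estimator are assumed to be available as a rectangular nested list `F` with `F[i][j]` holding the estimate at the grid point with row index i and column index j.

```python
from typing import List, Sequence


def rectangle_weights(F: Sequence[Sequence[float]]) -> List[List[float]]:
    """Weights by second-order differencing of the estimator on an I x J grid.

    The first row and the first column of the result stay at 0,
    matching the two initialization loops of the pseudocode.
    """
    n_rows, n_cols = len(F), len(F[0])
    w = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(1, n_cols):
        for i in range(1, n_rows):
            w[i][j] = F[i][j] - F[i][j - 1] - F[i - 1][j] + F[i - 1][j - 1]
    return w
```

Because the differences telescope, the weights sum to the estimator value at the last grid point whenever the estimator vanishes along the first row and column.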

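Having developed a method for computing the weights $w(\mathcal{X}_i, \mathcal{Y}_j)$ we are ready to present the expression for the smoothed density estimator $\hat f_{XY}(x, y)$:

$$\hat f_{XY}(x, y) = \sum_{i=1}^{I} \sum_{j=1}^{J} w(\mathcal{X}_i, \mathcal{Y}_j) \frac{1}{h_x} K\left( \frac{x - \mathcal{X}_i}{h_x} \right) \frac{1}{h_y} K\left( \frac{y - \mathcal{Y}_j}{h_y} \right)$$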
4 Kernel method in M dimensions

We will use $\mathcal{X}$ to designate an arbitrary element $(\mathcal{X}_{i_1}, \mathcal{X}_{i_2}, \ldots, \mathcal{X}_{i_M})$ of the grid $G$. Furthermore, given any vector $\mathcal{X} \in G$ such that $i_m \ge 2$ for $\forall m$ we will define the vector

$$\mathcal{X}^{-} = (\mathcal{X}_{i_1 - 1}, \mathcal{X}_{i_2 - 1}, \ldots, \mathcal{X}_{i_M - 1}) \in G$$

Let $\Pi(x)$ be the set of all hyperplanes passing through $x$ and parallel to the coordinate planes. For example, $\Pi(\mathcal{X}^{-})$ is the set of all hyperplanes passing through $\mathcal{X}^{-}$ and parallel to the coordinate planes. Define the function $w : G \to [0, 1]$ as follows: $w(\mathcal{X}) = \hat F(\mathcal{X}) = 0$ if there exists a component $\mathcal{X}_{i_m}$ of $\mathcal{X}$ such that $i_m = 1$. In the case when $i_m \ge 2$ for $\forall m$ the value of $w(\mathcal{X})$ is given by the cumulative probability assigned by $\hat F$ over the hyperrectangle in $R^M$ bounded by the hyperplanes in $\Pi(\mathcal{X}^{-})$ and $\Pi(\mathcal{X})$ but excluding the points lying on the hyperplanes in $\Pi(\mathcal{X}^{-})$.

Let $h = (h_1, h_2, \ldots, h_M)$. The smoothed function estimator is given by

$$\hat f(x) = \sum_{\mathcal{X} \in G} w(\mathcal{X}) K(x; \mathcal{X}, h)$$

where

$$K(x; \mathcal{X}, h) = \prod_{m=1}^{M} \frac{1}{h_m} K\left( \frac{x_m - \mathcal{X}_{i_m}}{h_m} \right)$$
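The product-kernel construction can be sketched as follows. The Gaussian kernel, the inclusion of the factor 1/h_m inside the product, and the names `product_kernel` and `smoothed_density_md` are assumptions made for illustration; grid nodes and their weights are taken as given.

```python
import math
from typing import Sequence


def gaussian_kernel(t: float) -> float:
    """Standard normal density, an assumed choice for the univariate kernel K."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def product_kernel(x: Sequence[float], node: Sequence[float],
                   h: Sequence[float]) -> float:
    """prod_m (1 / h_m) * K((x_m - node_m) / h_m)."""
    value = 1.0
    for xm, gm, hm in zip(x, node, h):
        value *= gaussian_kernel((xm - gm) / hm) / hm
    return value


def smoothed_density_md(x: Sequence[float], nodes: Sequence[Sequence[float]],
                        weights: Sequence[float], h: Sequence[float]) -> float:
    """sum over grid nodes of w(node) * K(x; node, h)."""
    return sum(w * product_kernel(x, node, h) for node, w in zip(nodes, weights))
```

Nodes with zero weight contribute nothing, so in practice only the grid points with all indices at least 2 need to be visited.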



5 A loss function for computing the optimal bandwidth

Consider once again a univariate random sample $Z_1, Z_2, \ldots, Z_N$ from some unknown pdf $f_Z$ and the kernel density estimator

$$\hat f_Z(z; h) = \frac{1}{N h} \sum_{n=1}^{N} K\left( \frac{z - z_n}{h} \right)$$

The bandwidth $h$ will be treated as a variable for the remainder of the section.

Also, to simplify notation whenever no ambiguity arises we will distinguish density functions by their argument only and drop the subscripting random variable. For example, $f(z)$ will represent $f_Z(z)$. In addition, we will use a subscript "$-n$" to indicate that a quantity has been derived based on the subset of the original random sample obtained after removing the $n$-th observation. For example, $\hat f_{-n}(z)$ is the kernel density estimator for $f(z)$ calculated after removing $Z_n$ from the original sample.

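The integrated square error is defined as

$$\int \left( \hat f(z) - f(z) \right)^2 dz = \int \hat f(z)^2 dz - 2 \int \hat f(z) f(z) dz + \int f(z)^2 dz$$

and the value of $h$ minimizing the risk function given by

$$R(h) = E\left\{ \int \left( \hat f(z) - f(z) \right)^2 dz \right\} = E\left\{ \int \hat f(z)^2 dz \right\} - 2 E\left\{ \int \hat f(z) f(z) dz \right\} + \int f(z)^2 dz$$

is generally viewed as the optimal choice for the value of $h$ in $\hat f_Z(z; h)$. The term $\int f(z)^2 dz$ is independent of $h$ and as a result we need to minimize

$$E\left\{ \int \hat f(z)^2 dz \right\} - 2 E\left\{ \int \hat f(z) f(z) dz \right\}$$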



The latter goal, however, is unachievable since the density $f_Z$ is unknown. In reality we seek to minimize the score function

$$M_0(h) = \int \hat f(z)^2 dz - \frac{2}{N} \sum_{n=1}^{N} \hat f_{-n}(z_n)$$

for two reasons. It is straightforward to demonstrate that

$$E\left\{ \frac{1}{N} \sum_{n=1}^{N} \hat f_{-n}(Z_n) \right\} = E\left\{ \int \hat f(z) f(z) dz \right\}$$

which immediately implies that

$$E\{M_0(h)\} = E\left\{ \int \hat f(z)^2 dz \right\} - 2 E\left\{ \int \hat f(z) f(z) dz \right\}$$

In addition, as stated by Silverman [2]: "Assuming that the minimizer of $M_0(h)$ is close to the minimizer of $E\{M_0(h)\}$ indicates why we might hope that minimizing $M_0$ gives a good choice of smoothing parameter."
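For exact data the score can be evaluated in closed form when the kernel is Gaussian, since the integral of the squared estimate is then a double sum of normal densities with standard deviation h times the square root of 2. The sketch below relies on that assumed kernel choice; the names `normal_pdf` and `lscv_score` are hypothetical.

```python
import math
from typing import Sequence


def normal_pdf(t: float, sd: float) -> float:
    """Density of N(0, sd^2) evaluated at t."""
    return math.exp(-0.5 * (t / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def lscv_score(sample: Sequence[float], h: float) -> float:
    """M_0(h) = int f_hat^2 dz - (2/N) * sum_n f_hat_{-n}(z_n), Gaussian kernel."""
    n = len(sample)
    if n < 2:
        raise ValueError("leave-one-out terms need at least two observations")
    # Integral of the squared estimate: Gaussian convolved with Gaussian.
    integral_sq = sum(normal_pdf(zi - zj, h * math.sqrt(2.0))
                      for zi in sample for zj in sample) / n ** 2
    # Average of the leave-one-out estimates at the left-out points.
    leave_one_out = sum(normal_pdf(zi - zj, h)
                        for i, zi in enumerate(sample)
                        for j, zj in enumerate(sample) if i != j) / (n * (n - 1))
    return integral_sq - 2.0 * leave_one_out
```

A bandwidth could then be chosen by evaluating `lscv_score` over a grid of candidate values of h and keeping the minimizer.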

Now we move on to motivate and introduce a score function $\tilde M_0(h)$ that mimics the form of $M_0(h)$ and can be used in the presence of censoring. We begin by defining the random variables

$$\tilde L_n = \max\{-\infty, L_n\}$$

$$\tilde R_n = \min\{R_n, +\infty\}$$

$$V_n = \frac{1}{2} (\tilde L_n + \tilde R_n)$$
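If we make the assumption that the probability distribution functions $g$ of $V_n$ and $f$ of $X_n$ are approximately equal, i.e. $g(v) \approx f(v)$, then we have that

$$E\left\{ \frac{1}{N} \sum_{n=1}^{N} \hat f_{-n}(V_n) \right\} = E\left\{ \int \hat f_{-1}(v) g(v) dv \right\} \approx E\left\{ \int \hat f_{-1}(v) f(v) dv \right\}$$

Since the expected values $E\left\{ \int \hat f_{-1}(x) f_X(x) dx \right\}$ and $E\left\{ \int \hat f(x) f_X(x) dx \right\}$ converge asymptotically we can conclude that for large samples

$$E\left\{ \frac{1}{N} \sum_{n=1}^{N} \hat f_{-n}(V_n) \right\} \approx E\left\{ \int \hat f_{-1}(x) f(x) dx \right\} \approx E\left\{ \int \hat f(x) f(x) dx \right\}$$

Consequently, we define $\tilde M_0(h)$ as

$$\tilde M_0(h) = \int \hat f(x)^2 dx - \frac{2}{N} \sum_{n=1}^{N} \hat f_{-n}(V_n)$$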
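In $M \ge 2$ dimensions we define the random variables

$$\tilde L_{nm} = \max\{-\infty, L_{nm}\}$$

$$\tilde R_{nm} = \min\{R_{nm}, +\infty\}$$

$$V_{nm} = \frac{1}{2} (\tilde L_{nm} + \tilde R_{nm})$$

and the random vector $V_n = (V_{n1}, V_{n2}, \ldots, V_{nM})$. Under the assumption that the probability distribution functions $g$ of $V_n$ and $f$ of $X_n$ are approximately equal, i.e. $g(v) \approx f(v)$, and based on identical reasoning we generalize the definition of $\tilde M_0(h)$ as follows:

$$\tilde M_0(h) = \int \hat f(x)^2 dx - \frac{2}{N} \sum_{n=1}^{N} \hat f_{-n}(V_n)$$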
6 Nadaraya-Watson regression with censored data



In regression analysis the goal is to estimate the expected value $E\{Y | X = x\}$ based on a random sample $\{(X_n, Y_n)\}$ from some unknown p.d.f. $f$ where $X_n$ is an M-dimensional vector of explanatory variables. Nadaraya and Watson [2113] have proposed a non-parametric estimator for $E\{Y | X = x\}$ derived from the kernel density estimator for $f$ in the case when all sample observations are exact. We employ the newly developed censoring kernel density estimator

$$\hat f(x) = \sum_{\mathcal{X} \in G} w(\mathcal{X}) K(x; \mathcal{X}, h)$$

and an identical pattern of reasoning to adapt the Nadaraya-Watson estimator for use with censored data.

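In 1 + 1 dimensions the censoring kernel density estimator can be written as

$$\hat f(x, y) = \sum_{(i, j)} w(\mathcal{X}_i, \mathcal{Y}_j) \frac{1}{h_x} K\left( \frac{x - \mathcal{X}_i}{h_x} \right) \frac{1}{h_y} K\left( \frac{y - \mathcal{Y}_j}{h_y} \right)$$

Consequently

$$\hat f(x) = \int \hat f(x, y) dy = \sum_{(i, j)} w(\mathcal{X}_i, \mathcal{Y}_j) \frac{1}{h_x} K\left( \frac{x - \mathcal{X}_i}{h_x} \right)$$

and

$$\int y \hat f(x, y) dy = \sum_{(i, j)} w(\mathcal{X}_i, \mathcal{Y}_j) \frac{1}{h_x} K\left( \frac{x - \mathcal{X}_i}{h_x} \right) \int y \frac{1}{h_y} K\left( \frac{y - \mathcal{Y}_j}{h_y} \right) dy$$

For a kernel $K$ that is symmetric about zero the inner integral equals $\mathcal{Y}_j$. Now we define the estimator $\hat E\{Y | X = x\}$ as follows:

$$\hat E\{Y | X = x\} = \frac{\int y \hat f(x, y) dy}{\hat f(x)} = \frac{\sum_{(i, j)} w(\mathcal{X}_i, \mathcal{Y}_j) K\left( \frac{x - \mathcal{X}_i}{h_x} \right) \mathcal{Y}_j}{\sum_{(i, j)} w(\mathcal{X}_i, \mathcal{Y}_j) K\left( \frac{x - \mathcal{X}_i}{h_x} \right)}$$

In $(M + 1)$ dimensions the same reasoning leads us to define the estimator $\hat E\{Y | X = x\}$ as

$$\hat E\{Y | X = x\} = \frac{\int y \hat f(x, y) dy}{\hat f(x)} = \frac{\sum_{(\mathcal{X}, \mathcal{Y}_j)} w(\mathcal{X}, \mathcal{Y}_j) K(x; \mathcal{X}, h) \mathcal{Y}_j}{\sum_{(\mathcal{X}, \mathcal{Y}_j)} w(\mathcal{X}, \mathcal{Y}_j) K(x; \mathcal{X}, h)}$$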
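The ratio form of the estimator can be sketched as a weighted average of the grid values of the response. The Gaussian product kernel and the names `product_kernel` and `censored_nadaraya_watson` are assumptions made for illustration; the grid nodes and their weights are taken as precomputed.

```python
import math
from typing import Sequence, Tuple

Node = Tuple[Sequence[float], float]  # (explanatory grid vector, response grid value)


def product_kernel(x: Sequence[float], node: Sequence[float],
                   h: Sequence[float]) -> float:
    """Gaussian product kernel prod_m (1 / h_m) * phi((x_m - node_m) / h_m)."""
    value = 1.0
    for xm, gm, hm in zip(x, node, h):
        t = (xm - gm) / hm
        value *= math.exp(-0.5 * t * t) / (hm * math.sqrt(2.0 * math.pi))
    return value


def censored_nadaraya_watson(x: Sequence[float], nodes: Sequence[Node],
                             weights: Sequence[float],
                             h: Sequence[float]) -> float:
    """sum w * K * y_j divided by sum w * K over all grid nodes."""
    numerator = 0.0
    denominator = 0.0
    for (grid_x, grid_y), w in zip(nodes, weights):
        k = w * product_kernel(x, grid_x, h)
        numerator += k * grid_y
        denominator += k
    if denominator == 0.0:
        raise ZeroDivisionError("no grid node carries weight near x")
    return numerator / denominator
```

Only the bandwidths of the explanatory variables enter the result, since the response kernel integrates out of both numerator and denominator.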

7 Parameter estimation for a multinomial distribution in the presence of censoring

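Let $c_1$ and $c_2$ be the respective observed numbers of outcomes of type 1 and type 2 in a binomial experiment with $N$ trials, $u = N - c_1 - c_2 \ge 1$ the number of trials with unknown outcomes and $\pi$ the probability of a single trial being of type 1. Let $N_1$ and $N_2$ designate the actual counts of type 1 and type 2. Consequently $N_1$ and $N_2$ are censored such that $(N_1, N_2) \in S_2$ where the set $S_2$ is defined by

$$S_2 = \{(l_1, l_2) \mid l_1, l_2 \text{ are non-negative integers}, \ l_1 \ge c_1, \ l_2 \ge c_2, \ l_1 + l_2 = N\}$$

If $E$ designates the event $(N_1, N_2) \in S_2$ then the likelihood of observing $E$ is given by

$$L(\pi; E) = \sum_{(n_1, n_2) \in S_2} \frac{N!}{n_1! \, n_2!} \pi^{n_1} (1 - \pi)^{n_2} = \sum_{n_1 = c_1}^{c_1 + u} \binom{N}{n_1} \pi^{n_1} (1 - \pi)^{N - n_1}$$

As already derived, the value $\hat\pi$ of $p$ that maximizes the function

$$L(p; E) = \sum_{n_1 = c_1}^{c_1 + u} \binom{N}{n_1} p^{n_1} (1 - p)^{N - n_1}$$

subject to the constraint $0 \le p \le 1$ is given by

$$
\hat\pi =
\begin{cases}
0 & \text{if } c_1 = 0 \text{ and } c_2 \ge 1 \\
1 & \text{if } c_2 = 0 \text{ and } c_1 \ge 1 \\
1/2 & \text{if } u = N \\
\left( 1 + \sqrt[u + 1]{\dfrac{c_2 (c_2 + 1) \cdots (c_2 + u)}{c_1 (c_1 + 1) \cdots (c_1 + u)}} \right)^{-1} & \text{o.w.}
\end{cases}
$$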



The treatment of a multinomial experiment with $N$ trials, $M$ possible outcome types and probabilities $\pi_1, \pi_2, \ldots, \pi_M$ of each outcome type is based on the same reasoning. We use $c_1, c_2, \ldots, c_M$ to designate the observed counts of each type and $N_1, N_2, \ldots, N_M$ to designate the actual and possibly censored outcome counts. Suppose $u = N - \sum_m c_m \ge 1$ and define the vectors

$$p = (p_1, p_2, \ldots, p_M)$$

$$c = (c_1, c_2, \ldots, c_M)$$

$$n = (n_1, n_2, \ldots, n_M)$$

$$N = (N_1, N_2, \ldots, N_M)$$

The definition of the set $S_2$ generalizes to

$$S_M = \left\{ (l_1, l_2, \ldots, l_M) \ \Big| \ l_m \text{ is a non-negative integer}, \ l_m \ge c_m, \ \sum_m l_m = N \right\}$$

and accordingly $E$ is redefined to be the event $(N_1, N_2, \ldots, N_M) \in S_M$. The likelihood of $E$ as a function of $p$ is given by

$$L(p; E) = \sum_{n \in S_M} \frac{N!}{n_1! \, n_2! \cdots n_M!} (p_1)^{n_1} (p_2)^{n_2} \cdots (p_M)^{n_M}$$

An approximate solution to the resulting estimation problem can be constructed as follows. If $\hat p_m$ is the value of the variable $p_m$ that maximizes the function

$$\sum_{n_m = c_m}^{c_m + u} \binom{N}{n_m} (p_m)^{n_m} (1 - p_m)^{N - n_m}$$

then we could employ

$$\hat\pi_m = \frac{\hat p_m}{\hat p_1 + \hat p_2 + \cdots + \hat p_M}$$

as an estimator for the unknown probability $\pi_m$.
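The approximate procedure, maximizing each marginal likelihood separately and then normalizing, can be sketched as follows. The function names `range_mle` and `multinomial_estimates` are ours; the sketch assumes the fully censored setting in which each of the u unknown trials could be of any type, and it reuses the closed-form binomial maximizer derived earlier.

```python
from typing import List, Sequence


def range_mle(c: int, u: int, n: int) -> float:
    """Maximizer of sum_{k=c}^{c+u} C(n, k) p^k (1-p)^(n-k) over 0 <= p <= 1."""
    a = n - c - u              # trials observed to be of some other type
    if u == n:
        return 0.5
    if c == 0:
        return 0.0
    if a == 0:
        return 1.0
    ratio = 1.0
    for j in range(u + 1):
        ratio *= (a + j) / (c + j)
    return 1.0 / (1.0 + ratio ** (1.0 / (u + 1)))


def multinomial_estimates(counts: Sequence[int], n: int) -> List[float]:
    """Normalize the per-type maximizers so that the estimates sum to one."""
    u = n - sum(counts)
    if u < 0:
        raise ValueError("observed counts exceed the number of trials")
    raw = [range_mle(c, u, n) for c in counts]
    total = sum(raw)
    return [r / total for r in raw]
```

For example, `multinomial_estimates([2, 2], 6)` treats the two types symmetrically, and with u = 0 the estimates reduce to the observed proportions.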

Next we consider a trinomial ($M = 3$) experiment such that $u_{12}$ trials are of type 1 or type 2 and $u_{23}$ are of type 2 or type 3 and define

$$u_1 = u_{12}$$

$$u_2 = \min\{N - c_1 - c_2 - c_3, \ u_{12} + u_{23}\}$$

$$u_3 = u_{23}$$

Let $\hat p_m$ be the value of the variable $p_m$ that maximizes the function

$$\sum_{n_m = c_m}^{c_m + u_m} \binom{N}{n_m} (p_m)^{n_m} (1 - p_m)^{N - n_m}$$

and

$$\pi^*_m = \frac{\hat p_m}{\hat p_1 + \hat p_2 + \hat p_3}$$

The quantities $\pi^*_1$, $\pi^*_2$ and $\pi^*_3$ can be used to estimate the unknown probabilities $\pi_1$, $\pi_2$ and $\pi_3$. Generalizing to the case of $M$ possible outcomes in the presence of partial censoring is straightforward. Let $u_m$ be the maximum possible number of censored outcomes of type $m$ and assume that $0, 1, 2, \ldots, u_m$ are all possible counts for the number of unobserved outcomes of type $m$. Consequently $\pi^*_m$ is a potential estimator for $\pi_m$.

So far we have been constructing likelihood functions without making assumptions or having the benefit of prior knowledge about the nature of the censoring mechanism. Let $q_m$ be the conditional probability of observing an outcome of type $m$ and $q = (q_1, q_2, \ldots, q_M)$. For example, let us consider a binomial ($M = 2$) experiment with known parameters $q_1$ and $q_2$. The probability of not being able to observe the outcome of a single trial $X_n$ is given by $(1 - q_1) p_1 + (1 - q_2) p_2 = (p_1 + p_2) - p_1 q_1 - p_2 q_2$. Consequently the likelihood of observing $c_1$ outcomes of type 1, $c_2$ outcomes of type 2 and $u = N - c_1 - c_2$ outcomes of unknown type is

$$L(p, q; c, u) = \frac{N!}{c_1! \, c_2! \, u!} (p_1 q_1)^{c_1} (p_2 q_2)^{c_2} \left[ (1 - q_1) p_1 + (1 - q_2) p_2 \right]^{u}$$

Generalizing is trivial:

$$L(p, q; c, u) = \frac{N!}{c_1! \, c_2! \cdots c_M! \, u!} \left[ \prod_m (p_m q_m)^{c_m} \right] \left[ \sum_m (1 - q_m) p_m \right]^{u}$$

where $u = N - \sum_m c_m$.
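A sketch of this likelihood on the log scale is given below. The function name `log_likelihood` is hypothetical, and the sketch assumes that every type with a positive observed count has a positive product p_m q_m and that the probability of an unobserved trial is positive whenever u is positive.

```python
import math
from typing import Sequence


def log_likelihood(p: Sequence[float], q: Sequence[float],
                   c: Sequence[int], u: int) -> float:
    """log of N!/(c_1!...c_M! u!) * prod (p_m q_m)^c_m * [sum (1-q_m) p_m]^u."""
    n = sum(c) + u
    log_coef = (math.lgamma(n + 1)
                - sum(math.lgamma(cm + 1) for cm in c)
                - math.lgamma(u + 1))
    observed = sum(cm * math.log(pm * qm)
                   for pm, qm, cm in zip(p, q, c) if cm > 0)
    prob_unobserved = sum((1.0 - qm) * pm for pm, qm in zip(p, q))
    unobserved = u * math.log(prob_unobserved) if u > 0 else 0.0
    return log_coef + observed + unobserved
```

With known q, estimates of the cell probabilities could be obtained by maximizing this function over the probability simplex, for instance by a grid search in the binomial case.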

Finally we turn our attention to a binomial experiment such that $q_1$ remains unknown but $q_2$ is known. The outcome $x_n$ of a single trial $X_n$ can be classified in exactly one of the following four categories: observed of type 1, observed of type 2, unobserved of type 1 and unobserved of type 2. Let $N_1$ designate the number of censored outcomes of type 1, $N_2$ designate the number of censored outcomes of type 2 and the set $S_2$ be defined as

$$S_2 = \{(l_1, l_2) \mid l_1, l_2 \text{ are non-negative integers}, \ l_1 + l_2 = N - c_1 - c_2\}$$

The likelihood of the event $E$ = "$(N_1, N_2) \in S_2$" is given by

$$L(\pi, q; E) = \sum_{(n_1, n_2)} \frac{N!}{c_1! \, c_2! \, n_1! \, n_2!} (\pi_1 q_1)^{c_1} (\pi_2 q_2)^{c_2} \left[ (1 - q_1) \pi_1 \right]^{n_1} \left[ (1 - q_2) \pi_2 \right]^{n_2}$$

where the summation index $(n_1, n_2)$ spans the set $S_2$. Consequently we seek to maximize the function

$$L(p, q_1; E) = \sum_{(n_1, n_2)} \frac{N!}{c_1! \, c_2! \, n_1! \, n_2!} (p_1 q_1)^{c_1} (p_2 q_2)^{c_2} \left[ (1 - q_1) p_1 \right]^{n_1} \left[ (1 - q_2) p_2 \right]^{n_2}$$

subject to the constraints $p_1 + p_2 = 1$ and $0 \le q_1 \le 1$.
8 Analysis of contingency tables with incomplete counts



Since each cell in an $I \times J$ contingency table can be uniquely associated with an ordered pair $(i, j)$, the set of ordered pairs $\{(i, j)\}$ constitutes the space of possible outcomes for a sample random variable $X_n$. Define the probabilities $\pi_{ij}$, $q_{ij}$ and $\alpha_{ij}$ as

$$\pi_{ij} = \mathrm{Prob}\{X_n = (i, j)\}$$

$$q_{ij} = \mathrm{Prob}\{X_n \text{ is observed} \mid X_n = (i, j)\}$$

$$\alpha_{ij} = \mathrm{Prob}\{X_n = (i, j) \text{ and } X_n \text{ is observed}\} = \pi_{ij} q_{ij}$$

Furthermore, let $c_{ij}$ and $N_{ij}$ be the respective observed and actual counts in cell $(i, j)$. We can quantify the effect of the censoring mechanism by observing that the ratio $\hat\alpha_{ij} = c_{ij} / N$ constitutes an MLE for the joint probability $\alpha_{ij}$ and using the plug-in principle within the equation $\alpha_{ij} = \pi_{ij} q_{ij}$ to obtain the estimator $\hat q_{ij} = \hat\alpha_{ij} / \hat\pi_{ij}$ for the unknown probability $q_{ij}$.

The actual count $N_{ij}$ may be unknown due to the censoring mechanism. From the definitions it follows that $c_{ij} = N_{ij}$ if outcomes of type $(i, j)$ are not subject to censoring and $c_{ij} \le N_{ij}$ otherwise. Finally, let us use $N_j = \sum_i N_{ij}$ to designate the $j$-th column total and in the case when $N_j$ is known let $u_j = N_j - \sum_i c_{ij}$ designate the number of sample outcomes censored into the $j$-th column.

Consider the special case of a $2 \times 2$ ($I = 2$, $J = 2$) contingency table and the null hypothesis

$$H_0 : \mathrm{Prob}\{X_n = (1, 1) \mid X_n \in \{(1, 1), (2, 1)\}\} = \mathrm{Prob}\{X_n = (1, 2) \mid X_n \in \{(1, 2), (2, 2)\}\}$$

which can be rewritten as

$$H_0 : \frac{\pi_{11}}{\pi_{11} + \pi_{21}} = \frac{\pi_{12}}{\pi_{12} + \pi_{22}}$$

Assuming $H_0$ in an estimation problem amounts to introducing the constraint

$$\frac{p_{11}}{p_{11} + p_{21}} = \frac{p_{12}}{p_{12} + p_{22}}$$

where $p_{ij}$ is the variable associated with the unknown cell probability $\pi_{ij}$. In the special case of predetermined column totals $N_1$ and $N_2$ we have that $\pi_{11} + \pi_{21} = 1$ as well as $\pi_{12} + \pi_{22} = 1$. Consequently, the null hypothesis is reduced to $H_0 : \pi_{11} = \pi_{12} = \pi$ and accordingly the null constraint becomes $p_{11} = p_{12} = p$.
Before turning our attention to three examples of censored $2 \times 2$ contingency tables we introduce some additional notation:

$$S = \left\{ (l_{11}, l_{21}, l_{12}, l_{22}) \ \Big| \ l_{ij} \text{ is a non-negative integer}, \ l_{ij} \ge c_{ij}, \ \sum l_{ij} = N \right\}$$

$$N = (N_{11}, N_{21}, N_{12}, N_{22})$$

$$n = (n_{11}, n_{21}, n_{12}, n_{22})$$

$$c = (c_{11}, c_{21}, c_{12}, c_{22})$$

$$p = (p_{11}, p_{21}, p_{12}, p_{22})$$

$$C = (C_{11}, C_{21}, C_{12}, C_{22})$$

In each example we construct the likelihood necessary to derive a set of estimators $\{\hat\pi_{ij}\}$ for the elements of $\{\pi_{ij}\}$. A superscript "(0)" will be used to label quantities derived under $H_0$. For example, $\hat\pi^{(0)}_{ij}$ is the null estimator for $\pi_{ij}$.

8.1 Example 1 

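Suppose that $N_1$ and $N_2$ are predetermined by the experimenter, the counts $N_{11}$ and $N_{21}$ are exact implying $u_1 = 0$ while the counts $N_{12}$ and $N_{22}$ are censored implying $u_2 \ge 1$. Let $E_1$ designate the event "$N \in S$ and $N_{11} + N_{21} = N_1$ and $N_{12} + N_{22} = N_2$". The likelihood of observing $E_1$ is given by

$$L(p; E_1) = L_1(p_{11}) L_2(p_{12})$$

where

$$L_1(p_{11}) = \binom{N_1}{N_{11}} (p_{11})^{N_{11}} (1 - p_{11})^{N_1 - N_{11}}$$

$$L_2(p_{12}) = \sum_{n_{12} = c_{12}}^{c_{12} + u_2} \binom{N_2}{n_{12}} (p_{12})^{n_{12}} (1 - p_{12})^{N_2 - n_{12}}$$

Since the column totals $N_1$ and $N_2$ are fixed and known in advance, under $H_0$ the likelihood function needs to be modified by setting $p = p_{11} = p_{12}$:

$$L(p; E_1, H_0) = \sum_{n_{12} = c_{12}}^{c_{12} + u_2} \binom{N_1}{N_{11}} \binom{N_2}{n_{12}} p^{N_{11} + n_{12}} (1 - p)^{N - N_{11} - n_{12}}$$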
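Let $t = (N_{11}, c_{12}, c_{22}, u_2)$ be a particular vector of counts for the contingency table. Then the probability of observing $t$ is given by

$$\mathrm{Prob}\{t\} = P_1(N_{11}) P_2(c_{12}, c_{22}, u_2)$$

where

$$P_1(N_{11}) = \binom{N_1}{N_{11}} (\pi_{11})^{N_{11}} (\pi_{21})^{N_1 - N_{11}}$$

$$P_2(c_{12}, c_{22}, u_2) = \frac{N_2!}{c_{12}! \, c_{22}! \, u_2!} (\alpha_{12})^{c_{12}} (\alpha_{22})^{c_{22}} (1 - \alpha_{12} - \alpha_{22})^{u_2}$$

We can estimate $\mathrm{Prob}\{t\}$ by using $\hat\pi_{11}$, $\hat\pi_{21}$, $\hat\alpha_{12}$ and $\hat\alpha_{22}$ for the unknown probabilities $\pi_{11}$, $\pi_{21}$, $\alpha_{12}$ and $\alpha_{22}$. Under $H_0$ we estimate $\mathrm{Prob}\{t\}$ by employing the appropriate null estimators $\hat\pi^{(0)}_{11}$ and $\hat\pi^{(0)}_{21}$ as opposed to $\hat\pi_{11}$ and $\hat\pi_{21}$.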

8.2 Example 2 

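Suppose $N_1$ and $N_2$ are predetermined by the experimenter and the counts $N_{11}$, $N_{21}$, $N_{12}$, $N_{22}$ are all unobserved. We use $E_2$ to designate the event "$N \in S$ and $N_{11} + N_{21} = N_1$ and $N_{12} + N_{22} = N_2$". The likelihood of $E_2$ is given by

$$L(p; E_2) = L_1(p_{11}) L_2(p_{12})$$

where

$$L_1(p_{11}) = \sum_{n_{11} = c_{11}}^{c_{11} + u_1} \binom{N_1}{n_{11}} (p_{11})^{n_{11}} (1 - p_{11})^{N_1 - n_{11}}$$

$$L_2(p_{12}) = \sum_{n_{12} = c_{12}}^{c_{12} + u_2} \binom{N_2}{n_{12}} (p_{12})^{n_{12}} (1 - p_{12})^{N_2 - n_{12}}$$

The two factors $L_1(p_{11}; E)$ and $L_2(p_{12}; E)$ can be maximized independently if no further assumptions are made. Under the null constraint $p = p_{11} = p_{12}$ the likelihood $L(p; E_2)$ is modified as follows:

$$L(p; E_2, H_0) = \left[ \sum_{n_{11} = c_{11}}^{c_{11} + u_1} \binom{N_1}{n_{11}} p^{n_{11}} (1 - p)^{N_1 - n_{11}} \right] \left[ \sum_{n_{12} = c_{12}}^{c_{12} + u_2} \binom{N_2}{n_{12}} p^{n_{12}} (1 - p)^{N_2 - n_{12}} \right]$$

$$L(p; E_2, H_0) = \sum_{(n_{11}, n_{12})} \binom{N_1}{n_{11}} \binom{N_2}{n_{12}} p^{n_{11} + n_{12}} (1 - p)^{N - n_{11} - n_{12}}$$

where $n \in S$ and $n_{11} + n_{21} = N_1$ and $n_{12} + n_{22} = N_2$.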
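The null likelihood of this example can be evaluated numerically as a product of two binomial range sums and maximized over a grid. The sketch below is illustrative only: the function names and the use of a plain grid search are our assumptions, not a method stated in the text.

```python
from math import comb
from typing import Callable


def range_likelihood(p: float, n_total: int, c: int, u: int) -> float:
    """sum_{k=c}^{c+u} C(n_total, k) p^k (1-p)^(n_total-k)."""
    return sum(comb(n_total, k) * p ** k * (1.0 - p) ** (n_total - k)
               for k in range(c, c + u + 1))


def null_likelihood(p: float, n1: int, c11: int, u1: int,
                    n2: int, c12: int, u2: int) -> float:
    """Product of the two column factors under the constraint p11 = p12 = p."""
    return range_likelihood(p, n1, c11, u1) * range_likelihood(p, n2, c12, u2)


def maximize_on_grid(f: Callable[[float], float], steps: int = 1000) -> float:
    """Return the grid point in [0, 1] with the largest value of f."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=f)
```

A call such as `maximize_on_grid(lambda p: null_likelihood(p, 4, 1, 0, 6, 4, 0))` searches for the common value of p; without censoring it recovers the pooled first-row proportion.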
The probability of a particular contingency table configuration is given by

$$\mathrm{Prob}\{c, u_1, u_2\} = P_1(c_{11}, c_{21}, u_1) P_2(c_{12}, c_{22}, u_2)$$

with

$$P_1(c_{11}, c_{21}, u_1) = \frac{N_1!}{c_{11}! \, c_{21}! \, u_1!} (\alpha_{11})^{c_{11}} (\alpha_{21})^{c_{21}} (1 - \alpha_{11} - \alpha_{21})^{u_1}$$

$$P_2(c_{12}, c_{22}, u_2) = \frac{N_2!}{c_{12}! \, c_{22}! \, u_2!} (\alpha_{12})^{c_{12}} (\alpha_{22})^{c_{22}} (1 - \alpha_{12} - \alpha_{22})^{u_2}$$

The estimators for $\alpha_{ij}$ remain unchanged under $H_0$ unless additional assumptions are made regarding the nature of the censoring mechanism.

8.3 Example 3 

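Suppose that $N_{11}$, $N_{21}$, $N_{12}$, $N_{22}$ as well as the column totals $N_1$ and $N_2$ are all unknown. Let $u = N - (c_{11} + c_{21} + c_{12} + c_{22})$ and $E_3$ designate the event "$N \in S$". The estimators $\hat\pi_{ij}$ maximize the likelihood

$$L(p; E_3) = \sum_{n \in S} \frac{N!}{n_{11}! \, n_{21}! \, n_{12}! \, n_{22}!} (p_{11})^{n_{11}} (p_{21})^{n_{21}} (p_{12})^{n_{12}} (p_{22})^{n_{22}}$$

By enforcing the null constraint the above likelihood is reduced to

$$L(p; E_3, H_0) = \sum_{n \in S} \frac{N!}{n_{11}! \, n_{21}! \, n_{12}! \, n_{22}!} p^{n_{11}} (1 - p)^{n_{21}} p^{n_{12}} (1 - p)^{n_{22}} = \sum_{n \in S} \frac{N!}{n_{11}! \, n_{21}! \, n_{12}! \, n_{22}!} p^{n_{11} + n_{12}} (1 - p)^{n_{21} + n_{22}}$$

The probability of a particular contingency table configuration is given by

$$\mathrm{Prob}\{c, u\} = \frac{N!}{c_{11}! \, c_{21}! \, c_{12}! \, c_{22}! \, u!} P_1(c) P_2(u)$$

where

$$P_1(c) = (\alpha_{11})^{c_{11}} (\alpha_{21})^{c_{21}} (\alpha_{12})^{c_{12}} (\alpha_{22})^{c_{22}}$$

$$P_2(u) = (1 - \alpha_{11} - \alpha_{21} - \alpha_{12} - \alpha_{22})^{u}$$

Assuming $H_0$ does not modify the estimate for $\mathrm{Prob}\{c, u\}$.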

Extending the ideas presented in this section to the construction of appropriate likelihood functions for contingency tables with $I > 2$ rows and $J > 2$ columns in the presence of a censoring mechanism should be trivial in most cases. Solving the resulting optimization problems, however, may be far from straightforward.